
598 research outputs found

A continuous analog of run length distributions reflecting accumulated fractionation events

Author: B Wang
C Zheng
D Sankoff
David Sankoff
JK Byrnes
MJ van Hoek
Zhe Yu
Publication venue: 'Springer Science and Business Media LLC'

Crossref

Matching Sequences under Deletion/Insertion Constraints

Author: D. Sankoff
Publication venue: 'Proceedings of the National Academy of Sciences'

Crossref

The median problem for breakpoints in comparative genomics

Author: D. Sankoff
D. Sankoff
E. Minieka
J. Kececioglu
J.H. Nadeau
M. Blanchette
V. Bafna
Publication venue: 'Springer Science and Business Media LLC'

Crossref

Guided genome halving: hardness, heuristics and the history of the Hemiascomycetes

Author: Bourque
C. Zheng
Caprara
D. Sankoff
Dujon
GOWER
Kurtzman
Q. Zhu
Sankoff
Wolfe
Z. Adam
Publication venue: Oxford University Press
Publication date: 01/01/2008

Motivation: Some present day species have incurred a whole genome doubling event in their evolutionary history, and this is reflected today in patterns of duplicated segments scattered throughout their chromosomes. These duplications may be used as data to ‘halve’ the genome, i.e. to reconstruct the ancestral genome at the moment of doubling, but the solution is often highly nonunique. To resolve this problem, we take account of outgroups, external reference genomes, to guide and narrow down the search

CiteSeerX

Crossref

PubMed Central

Algorithmic and Hardness Results for the Colorful Components Problems

Author: A. Avidor
A. Paz
C. Zheng
D. Sankoff
M. Bellare
S. Bruckner
U. Feige
Publication date: 06/11/2013

In this paper we investigate the colorful components framework, motivated by applications emerging from comparative genomics. The general goal is to remove a collection of edges from an undirected vertex-colored graph $G$ such that in the resulting graph $G'$ all the connected components are colorful (i.e., any two vertices of the same color belong to different connected components). We want $G'$ to optimize an objective function, the selection of this function being specific to each problem in the framework. We analyze three objective functions, and thus, three different problems, which are believed to be relevant for the biological applications: minimizing the number of singleton vertices, maximizing the number of edges in the transitive closure, and minimizing the number of connected components. Our main result is a polynomial time algorithm for the first problem. This result disproves the conjecture of Zheng et al. that the problem is $NP$-hard (assuming $P \neq NP$). Then, we show that the second problem is $APX$-hard, thus proving and strengthening the conjecture of Zheng et al. that the problem is $NP$-hard. Finally, we show that the third problem does not admit polynomial time approximation within a factor of $|V|^{1/14 - \epsilon}$ for any $\epsilon > 0$, assuming $P \neq NP$ (or within a factor of $|V|^{1/2 - \epsilon}$, assuming $ZPP \neq NP$).

Comment: 18 pages, 3 figures
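The definitions in the abstract above can be made concrete with a small sketch. The Python below is illustrative only and is not the paper's algorithm: it checks whether every connected component of a vertex-colored graph is colorful and evaluates the three objective values named in the abstract for a given edge set. The function names (`connected_components`, `is_colorful`, `evaluate`) and the toy graph are assumptions introduced for this example.

```python
from collections import defaultdict


def connected_components(vertices, edges):
    """Return the connected components of an undirected graph as a list of sets."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    seen, comps = set(), []
    for start in vertices:
        if start in seen:
            continue
        comp, stack = set(), [start]
        seen.add(start)
        while stack:  # iterative depth-first search
            u = stack.pop()
            comp.add(u)
            for w in adj[u]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        comps.append(comp)
    return comps


def is_colorful(component, color):
    """A component is colorful if no two of its vertices share a color."""
    colors = [color[v] for v in component]
    return len(colors) == len(set(colors))


def evaluate(vertices, edges, color):
    """Check colorfulness and compute the three objective values for one graph."""
    comps = connected_components(vertices, edges)
    return {
        "colorful": all(is_colorful(c, color) for c in comps),
        "singletons": sum(1 for c in comps if len(c) == 1),
        # The transitive closure of a component with k vertices has k*(k-1)/2 edges.
        "closure_edges": sum(len(c) * (len(c) - 1) // 2 for c in comps),
        "components": len(comps),
    }


# Hypothetical toy instance: a path a-b-c-d with alternating colors.
vertices = ["a", "b", "c", "d"]
color = {"a": "red", "b": "blue", "c": "red", "d": "blue"}
g = [("a", "b"), ("b", "c"), ("c", "d")]
g_prime = [("a", "b"), ("c", "d")]  # G' obtained by removing edge (b, c)

before = evaluate(vertices, g, color)
after = evaluate(vertices, g_prime, color)
```

In this toy instance the original path is not colorful, while removing the single edge (b, c) leaves two colorful components. Finding the best set of edges to remove is the optimization problem the paper studies; this sketch only scores a candidate solution.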

arXiv.org e-Print Archive

Crossref

Power Boosts for Cluster Tests

Author: D. Durand
D. Sankoff
P.P. Calabrese
R. Hoberman
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2005

Abstract. Gene cluster significance tests that are based on the number of genes in a cluster in two genomes, and how compactly they are distributed, but not their order, may be made more powerful by the addition of a test component that focuses solely on the similarity of the ordering of the common genes in the clusters in the two genomes. Here we suggest four such tests, compare them, and investigate one of them, the maximum adjacency disruption criterion, in some detail, analytically and through simulation.

CiteSeerX

Crossref

Chromosomal Breakpoint Reuse in Genome Sequence Rearrangement

Author: David Sankoff
Hannenhalli S.
Pevzner P.A.
Phil Trinh
Sankoff D.
Publication venue: 'Mary Ann Liebert Inc'

Crossref

Parking functions, labeled trees and DCJ sorting scenarios

Author: A. Bergeron
A. McLysaght
A.C. Siepel
A.G. Konheim
A.W. Xu
D. Sankoff
E. Barcucci
I. Miklós
M. Ozery-flato
M.D.V. Braga
P. Pevzner
R.P. Stanley
S. Bérard
S. Yancopoulos
Y. Ajana
Y. Diekmann
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2009

In genome rearrangement theory, one of the elusive questions raised in recent years is the enumeration of rearrangement scenarios between two genomes. This problem is related to the uniform generation of rearrangement scenarios, and the derivation of tests of statistical significance of the properties of these scenarios. Here we give an exact formula for the number of double-cut-and-join (DCJ) rearrangement scenarios of co-tailed genomes. We also construct effective bijections between the set of scenarios that sort a cycle and well-studied combinatorial objects such as parking functions and labeled trees.

Comment: 12 pages, 3 figures
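For readers unfamiliar with the combinatorial objects named in the abstract above, the following sketch illustrates parking functions. It does not reproduce the paper's formula or its bijections. It relies only on the classical facts that a sequence of n integers in 1..n is a parking function when its sorted values b_1 <= ... <= b_n satisfy b_i <= i, and that there are (n+1)^(n-1) such sequences, the same as the number of labeled trees on n+1 vertices (Cayley's formula). The function names are assumptions introduced for this example.

```python
from itertools import product


def is_parking_function(seq):
    """True if the i-th smallest entry (1-indexed) is at most i for every i."""
    return all(b <= i for i, b in enumerate(sorted(seq), start=1))


def count_parking_functions(n):
    """Count parking functions of length n by brute force (small n only)."""
    return sum(
        1
        for seq in product(range(1, n + 1), repeat=n)
        if is_parking_function(seq)
    )


# Compare the brute-force counts with the closed form (n+1)^(n-1).
counts = {n: count_parking_functions(n) for n in range(1, 5)}
closed_form = {n: (n + 1) ** (n - 1) for n in range(1, 5)}
```

The brute-force enumeration examines n^n sequences, so it is only practical for very small n. Its purpose here is to confirm the closed form on a few cases.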

arXiv.org e-Print Archive

CiteSeerX

HAL - Lille 3

Crossref

INRIA a CCSD electronic archive server

A framework for orthology assignment from gene rearrangement data

Author: A. Caprara
B. Larget
B.M.E. Moret
C. Thach Nguyen
D. Bryant
D. Sankoff
D.A. Bader
G. Tesler
J. Earnest-DeYoung
J. Tang
J.L. Boore
K.M. Swenson
M. Blanchette
M. Marron
M.E. Cosner
N. El-Mabrouk
S. Hannenhalli
S.R. Downie
X. Chen
Publication venue: Springer
Publication date: 01/01/2005

Abstract. Gene rearrangements have successfully been used in phylogenetic reconstruction and comparative genomics, but usually under the assumption that all genomes have the same gene content and that no gene is duplicated. While these assumptions allow one to work with organellar genomes, they are too restrictive when comparing nuclear genomes. The main challenge is how to deal with gene families, specifically, how to identify orthologs. While searching for orthologies is a common task in computational biology, it is usually done using sequence data. We approach that problem using gene rearrangement data, provide an optimization framework in which to phrase the problem, and present some preliminary theoretical results.

CiteSeerX

Crossref

On the PATHGROUPS approach to rapid small phylogeny

Author: A Caprara
AC Siepel
AW Xu
C Zheng
Chunfang Zheng
D Sankoff
David Sankoff
E Tannier
G Fertin
KP Byrne
N El-Mabrouk
R Warren
S Yancopoulos
SM Hedtke
Z Adam
Publication venue: BioMed Central
Publication date: 01/01/2011

We present a data structure enabling rapid heuristic solution to the ancestral genome reconstruction problem for given phylogenies under genomic rearrangement metrics. The efficiency of the greedy algorithm is due to fast updating of the structure during run time and a simple priority scheme for choosing the next step. Since accuracy deteriorates for sets of highly divergent genomes, we investigate strategies for improving accuracy and expanding the range of data sets where accurate reconstructions can be expected. This includes a more refined priority system, and a two-step look-ahead, as well as iterative local improvements based on the median version of the problem, incorporating simulated annealing. We apply this to a set of yeast genomes to corroborate a recent gene sequence-based phylogeny.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central